Indoor scenes typically exhibit complex, spatially-varying appearance from global illumination, making inverse rendering a challenging ill-posed problem. This work presents an end-to-end, learning-based inverse rendering framework incorporating differentiable Monte Carlo raytracing with importance sampling. The framework takes a single image as input to jointly recover the underlying geometry, spatially-varying lighting, and photorealistic materials. Specifically, we introduce a physically-based differentiable rendering layer with screen-space ray tracing, resulting in more realistic specular reflections that match the input photo. In addition, we create a large-scale, photorealistic indoor scene dataset with significantly richer details like complex furniture and dedicated decorations. Further, we design a novel out-of-view lighting network with uncertainty-aware refinement leveraging hypernetwork-based neural radiance fields to predict lighting outside the view of the input photo. Through extensive evaluations on common benchmark datasets, we demonstrate superior inverse rendering quality of our method compared to state-of-the-art baselines, enabling various applications such as complex object insertion and material editing with high fidelity. Code and data will be made available at https://jingsenzhu.github.io/invrend.
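In this paper, we propose a two-stage optimization strategy, named CCPNRL-GA, for solving large-scale traveling salesman problems (LSTSPs). First, we hypothesize that the participation of a well-performing individual as an elite can accelerate the convergence of optimization. Based on this hypothesis, in the first stage we cluster the cities, decompose the LSTSP into multiple subcomponents, and optimize each subcomponent with a reusable Pointer Network (PtrNet). After subcomponent optimization, we combine all sub-tours to form a valid solution, which then joins the second stage of optimization with a GA. We validate the performance of our proposal on 10 LSTSPs and compare it with traditional EAs. Experimental results show that the participation of an elite individual can greatly accelerate the optimization of LSTSPs, and that our proposal has broad prospects for handling LSTSPs.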
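Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate the complete Rashomon set for sparse decision trees, a set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop Timbertrek, the first interactive visualization system that summarizes thousands of sparse decision trees at scale. Two usage scenarios highlight how Timbertrek enables users to easily explore, compare, and curate models that align with their domain knowledge and values. Our open-source tool runs directly in users' computational notebooks and web browsers, lowering the barrier to creating more responsible ML models. Timbertrek is available at the following public demo link: https://poloclub.github.io/timbertrek.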
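In any given machine learning problem, there may be many models that explain the data well. However, most learning algorithms return only one of these models, leaving practitioners with no practical way to explore alternative models that might have desirable properties beyond what can be expressed in a loss function. The Rashomon set is the set of all these almost-optimal models. Rashomon sets can be extremely complicated, particularly for highly nonlinear function classes that allow complex interaction terms, such as decision trees. We provide the first technique for completely enumerating the Rashomon set for sparse decision trees; in fact, our work provides the first complete enumeration of any Rashomon set for a non-trivial problem with a highly nonlinear discrete function class. This gives users an unprecedented level of control over model choice among all models that are approximately equally good. We represent the Rashomon set in a specialized data structure that supports efficient querying and sampling. We show three applications of the Rashomon set: 1) it can be used to study variable importance for the set of almost-optimal trees (as opposed to a single tree), 2) the Rashomon set for accuracy enables enumeration of the Rashomon sets for balanced accuracy and F1-score, and 3) the Rashomon set for a full dataset can be used to produce Rashomon sets constructed with only subsets of the dataset. Thus, we are able to examine Rashomon sets across problems with a new lens, enabling users to choose models rather than be at the mercy of an algorithm that produces only a single model.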
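To make the notion of "almost-optimal models" concrete, here is a minimal sketch of filtering a pool of already-evaluated candidate models down to a Rashomon set. It is not the enumeration technique described in the abstract: the function name `rashomon_set`, the multiplicative tolerance `epsilon`, and the example tree names and losses are all assumptions made for illustration.

```python
def rashomon_set(model_losses, epsilon):
    """Keep the models whose loss is within a factor (1 + epsilon) of the best loss.

    model_losses: dict mapping a (hypothetical) model name to its loss.
    epsilon: assumed multiplicative tolerance; the abstract does not fix a threshold.
    """
    if not model_losses:
        return {}
    best = min(model_losses.values())
    threshold = (1.0 + epsilon) * best
    return {name: loss for name, loss in model_losses.items() if loss <= threshold}


# Hypothetical candidate trees with made-up losses.
candidates = {"tree_a": 0.10, "tree_b": 0.104, "tree_c": 0.12}
near_optimal = rashomon_set(candidates, epsilon=0.05)
print(sorted(near_optimal))
```

A filter like this only works when every candidate can be listed and scored in advance; the point of the abstract is that, for sparse decision trees, producing that complete list is itself the hard part.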
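Many optimization problems suffer from noise, and nonlinearity-check-based decomposition methods (e.g., differential grouping) completely fail to detect interactions between variables in multiplicative noisy environments, which makes it difficult to decompose large-scale optimization problems (LSOPs) in noisy environments. In this paper, we propose automatic random grouping (ARG), which does not require any explicit hyperparameter specified by the user. Simulation experiments and mathematical analysis show that ARG can detect interactions between variables without knowledge of the fitness landscape, and that the subproblems decomposed by ARG are smaller in scale, which makes them easier for EAs to optimize. Based on the cooperative coevolution (CC) framework, we introduce an advanced optimizer named modified differential evolution with distance-based selection (MDE-DS) to enhance search ability in noisy environments. Compared with canonical DE, parameter self-adaptation, the balance between diversification and intensification, and distance-based probability selection endow MDE-DS with stronger exploration and exploitation abilities. To evaluate the performance of our proposal, we design $500$-D and $1000$-D problems based on the CEC2013 LSGO Suite. Numerical experiments show that our proposal has broad prospects for solving LSOPs in noisy environments and can be easily extended to higher-dimensional problems.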
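In this paper, we propose a simple strategy for estimating the convergence point by averaging the elite sub-population. Based on this idea, we derive two methods: an ordinary averaging strategy and a weighted averaging strategy. We also design a Gaussian sampling operator whose mean is the estimated convergence point and which has a certain standard deviation. This operator is combined with the traditional differential evolution algorithm (DE) to accelerate convergence. Numerical experiments show that our proposal can accelerate DE on most of the 28 low-dimensional test functions of the CEC2013 Suite, and that it can easily be extended to other population-based evolutionary algorithms with simple modifications.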
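The two averaging strategies and the Gaussian sampling operator described above could be sketched roughly as follows, assuming minimization. The elite fraction, the linear rank-based weights, and the function names are illustrative assumptions; the abstract does not specify how elites are chosen, how they are weighted, or how the standard deviation is set.

```python
import random


def estimate_convergence_point(population, fitness, elite_fraction=0.2, weighted=False):
    """Estimate the convergence point as the (weighted) mean of the elite sub-population.

    population: list of real-valued vectors; fitness: matching losses (lower is better).
    elite_fraction and the rank-based weights are assumptions made for this sketch.
    """
    n_elite = max(1, round(elite_fraction * len(population)))
    ranked = sorted(range(len(population)), key=lambda i: fitness[i])
    elites = [population[i] for i in ranked[:n_elite]]
    if weighted:
        # Linear rank weights: the best elite gets weight n_elite, the worst gets 1.
        raw = [float(n_elite - r) for r in range(n_elite)]
    else:
        raw = [1.0] * n_elite
    total = sum(raw)
    dim = len(elites[0])
    return [sum(w * x[d] for w, x in zip(raw, elites)) / total for d in range(dim)]


def gaussian_sample(center, sigma, n_samples, rng):
    """Draw candidates from an isotropic Gaussian centred on the estimated point."""
    return [[rng.gauss(c, sigma) for c in center] for _ in range(n_samples)]


# Toy usage on the sphere function f(x) = sum(x_i^2).
rng = random.Random(0)
pop = [[rng.uniform(-5.0, 5.0) for _ in range(3)] for _ in range(20)]
fit = [sum(v * v for v in x) for x in pop]
center = estimate_convergence_point(pop, fit, elite_fraction=0.25, weighted=True)
candidates = gaussian_sample(center, sigma=0.5, n_samples=5, rng=rng)
```

In a DE loop, the sampled candidates would be evaluated and allowed to replace weaker individuals; that integration step is left out here because the abstract does not describe it.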
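In this paper, we propose a variable grouping method based on cooperative coevolution for large-scale multi-objective problems (LSMOPs), named linkage measurement minimization (LMM). For the subproblem optimization stage, we propose a hybrid NSGA-II with a Gaussian sampling operator based on an estimated convergence point. In the variable grouping stage, following our previous research, we treat the variable grouping problem as a combinatorial optimization problem, and the linkage measurement function is designed based on linkage identification by nonlinearity check for real-coded GA (LINC-R). We extend this variable grouping method to LSMOPs. In the subproblem optimization stage, we hypothesize that better solutions are more likely to exist around the Pareto front (PF). Based on this hypothesis, we estimate a convergence point in every generation of optimization and perform Gaussian sampling around it. Samples with good objective values participate in the optimization as elites. Numerical experiments show that our variable grouping method outperforms some popular variable grouping methods, and that the hybrid NSGA-II has broad prospects for multi-objective optimization.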
A recent study has shown a phenomenon called neural collapse in that the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
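For reference, the simplex equiangular tight frame mentioned above can be built explicitly. The sketch below uses the standard construction in which vertex k is sqrt(K/(K-1)) * (e_k - 1/K), with the optional rotation taken to be the identity (an assumption made for simplicity). It shows only the target geometry, not the regularizer proposed in the abstract, and the function names are hypothetical.

```python
import math


def simplex_etf(num_classes):
    """Return K unit vectors in R^K forming a simplex equiangular tight frame.

    Vertex k is sqrt(K / (K - 1)) * (e_k - 1/K); the usual rotation matrix is
    assumed to be the identity in this sketch.
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vertices = simplex_etf(4)
# Every pair of distinct vertices has cosine -1 / (K - 1), the most negative
# value that K vectors can all share pairwise.
pairwise = [cosine(vertices[i], vertices[j]) for i in range(4) for j in range(i + 1, 4)]
```

Comparing the pairwise cosines of learned class centers against -1/(K-1) is one simple way to see how far a trained network is from this structure.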
We aim to bridge the gap between our common-sense few-sample human learning and large-data machine learning. We derive a theory of human-like few-shot learning from the von Neumann-Landauer principle. Modelling human learning is difficult, as how people learn varies from one person to another. Under commonly accepted definitions, we prove that all human or animal few-shot learning, and major models of such learning including the Free Energy Principle and Bayesian Program Learning, approximate our theory under the Church-Turing thesis. We find that deep generative models such as the variational autoencoder (VAE) can be used to approximate our theory and perform significantly better than baseline models, including deep neural networks, on image recognition, low-resource language processing, and character recognition.
The interview is regarded as one of the most crucial steps in recruitment. To prepare fully for interviews with recruiters, job seekers usually practice by holding mock interviews with each other. However, such a mock interview with peers is generally far from the real interview experience: the mock interviewers are not guaranteed to be professional and are unlikely to behave like real interviewers. Due to the rapid growth of online recruitment in recent years, recruiters tend to hold interviews online, which makes it possible to collect real interview data from real interviewers. In this paper, we propose a novel application named EZInterviewer, which aims to learn from online interview data and provide mock interview services to job seekers. The task is challenging in two ways: (1) interview data are now available but remain low-resource; (2) generating meaningful and relevant interview dialogs requires a thorough understanding of both resumes and job descriptions. To address the low-resource challenge, EZInterviewer is trained on a very small set of interview dialogs. The key idea is to reduce the number of parameters that rely on interview dialogs by disentangling the knowledge selector and the dialog generator, so that most parameters can be trained with ungrounded dialogs as well as resume data, neither of which is low-resource. Evaluation results on a real-world job interview dialog dataset indicate that our method achieves promising results in generating mock interviews. With the help of EZInterviewer, we hope to make mock interview practice easier for job seekers.